Abstract

This article describes a Bayesian framework for estimation
in item response models, with two-stage prior distributions on
both item and examinee populations. Strategies for point and
interval estimation are discussed, and a general procedure based on
the EM algorithm is presented. Details are given for
implementation under one-, two-, and three-parameter logistic IRT
models. Novel features include minimally restrictive assumptions
about examinee distributions and the exploitation of dependence
among item parameters in a population of interest. Improved
estimation in a moderately small sample is demonstrated with
simulated data.

Key words: Bayesian estimation
EM algorithm
Hierarchical prior distributions
Item response models
Marginal maximum likelihood
Introduction

Simultaneous estimation of many parameters can often be 
improved, sometimes dramatically so, if it is reasonable to consider 
one or more subsets of parameters as exchangeable members of 
corresponding populations (Efron & Morris, 1975; James & Stein, 
1961; Kelley, 1927; Lindley & Smith, 1972). The idea is that while 
each observation may provide limited information about the 
parameters it is modeled directly in terms of, it also contributes 
information about the populations to which they belong. Knowledge
about the populations, generally superior to knowledge about 
individual parameters, can in turn be brought to bear in the 
estimation of any individual parameter. Novick et al. (1972) and 
Rubin (1980), for example, provide Bayes and empirical Bayes 
solutions respectively to the problem of predicting student 
performance in a given law school when data are available for 
several law schools. Both studies obtained more stable estimates in 
small schools and improved cross-validation results when compared to 
independent estimation within schools.

Analogous procedures for the IRT setting have begun to appear 
in the psychometric literature. Bock and Aitkin (1981), Rigdon 
and Tsutakawa (1983), and Thissen (1982) address the problem of 
incidental examinee parameters by integrating over a population 
density to produce marginal likelihood functions for item 
parameters. Reiser (1981) and Mislevy and Bock (1981) extended
this model by positing prior distributions for item parameters.
Swaminathan and Gifford (1982, 1984, in press) employ two-stage
priors for examinee parameters and selected item parameters, then
obtain the joint posterior mode for all individual parameters.
Andersen and Madsen (1977), Mislevy (1984), and Sanathanan and
Blumenthal (1978) provide maximum likelihood solutions for the
parameters of examinee population distributions, conditional on
item parameters. Finally, Bock and Aitkin (1981) and Bock and
Mislevy (1982) derive posterior means and standard deviations of
the parameters of individual examinees, conditional on item and 
examinee population parameters. 

The aforementioned procedures can all be expressed as special
cases of a more comprehensive Bayesian framework for estimation in
item response models. Working along lines first suggested by
Lindley and Smith (1972), we begin by introducing a model for item 
responses that employs two levels of prior distributions on both 
item and examinee parameters. A general discussion of theoretical 
and practical considerations in estimating the parameters of such a 
model, including an EM computing algorithm (Dempster, Laird, & 
Rubin, 1977), follows. Procedures specific to logistic item 
response models (Birnbaum, 1968; Rasch, 1960a; Lord, 1980) are then
detailed. We illustrate the techniques with simulated data and 
conclude by discussing possible extensions of the procedures. 
The General Form of the Model

Let $\theta$ denote examinee ability and $p(\theta \mid \tau)$ its density,
conditional on examinee population parameters $\tau$. If $\theta$ follows a
normal distribution, for example, $\tau = (\mu_\theta, \sigma_\theta^2)$, the mean and
variance. $\tau$ is assumed in turn to follow density $p(\tau)$. In the
same manner, let $\xi$ denote the parameter(s) of a test item and
$p(\xi \mid \eta)$ denote its density, conditional on item population
parameters $\eta$; $\eta$ in turn follows density $p(\eta)$. Independence over
examinees and items is assumed, given $\tau$ and $\eta$.

Let $d_{ij}$ take the value 1 if examinee $i$ is administered item $j$
and 0 if not. For $n$ items of interest, let $d_i = (d_{i1}, \ldots, d_{in})$;
for $N$ examinees, let $D = (d_1, \ldots, d_N)$. Let $u_{ij}$ denote the response
of examinee $i$ to item $j$, taking the value 1 if the item was
administered and answered correctly, and 0 otherwise; define $u_i$ and
$U$ in analogy to $d_i$ and $D$. Denote by $L(U \mid D, \theta, \xi)$ the likelihood
of the possibly incomplete matrix of responses of subjects with
abilities $\theta = (\theta_1, \ldots, \theta_N)$ to items with parameters
$\xi = (\xi_1, \ldots, \xi_n)$. By Bayes theorem, the posterior density of
$\theta$, $\xi$, $\tau$, and $\eta$, given realized observations $U$, is given by

$$p(\theta, \xi, \tau, \eta \mid D, U) \propto L(U \mid D, \theta, \xi)\, p(\theta \mid \tau)\, p(\tau)\, p(\xi \mid \eta)\, p(\eta) . \tag{2.1}$$
After the forms of the likelihood function $L$ and the prior
densities $p(\theta \mid \tau)$ and $p(\xi \mid \eta)$ have been chosen, the highest level
prior densities $p(\tau)$ and $p(\eta)$ have been specified, and the data $U$
have been observed, (2.1) contains all information available about
the parameters in the model. The sheer incomprehensibility of a
joint distribution of possibly thousands of variables, however,
demands summary in terms of salient attributes, to be used in
constructing point and interval estimates, for example.

The joint mean of the posterior has the desirable property that
the value for each component retains the same value in any marginal
distribution obtained by integrating (2.1) over any subset of
remaining components. Posterior modes, which do not exhibit this
invariance, are more often seen in practice in complex problems such
as the one at hand, since they prove easier to obtain. Generally
speaking, a parameter's marginal posterior mode is a better
approximation of its posterior mean than is its joint mode (O'Hagan,
1976). This is especially so when "nuisance" parameters appearing
in the joint posterior, along with the parameters of interest, are
poorly determined. Examinee parameters $\theta$ follow this
description in the present context, and we shall integrate over
their distribution routinely to obtain

$$p(\xi, \tau, \eta \mid D, U) = \int p(\theta, \xi, \tau, \eta \mid D, U)\, d\theta . \tag{2.2}$$

The reduction in dimensionality thus achieved assures that the
marginal modes of the remaining item and population parameters will
better approximate their means.

In principle, it is also possible to integrate over item
parameters as well in order to obtain the marginal distribution of
item and examinee population parameters alone:

$$p(\tau, \eta \mid D, U) \propto \int_\xi \int_\theta p(\theta, \xi, \tau, \eta \mid D, U)\, d\theta\, d\xi . \tag{2.3}$$

The numerical integration required to effect (2.3), however, is not
tractable for any but trivial problems with currently available
computing machinery. An alternative suggested by Leonard (1982)
is to approximate the marginal density of $\tau$ and $\eta$ as follows:

$$p(\tau, \eta \mid D, U) \approx p(\tau, \eta, \xi = \hat\xi_{\tau,\eta} \mid D, U)\, |H|^{-1/2} ,$$

where

$$H = \frac{\partial^2 \log p(\xi, \eta, \tau)}{\partial \xi\, \partial \xi'} ,$$

with $\hat\xi$ denoting the modal value of $\xi$ from (2.2), evaluated at
particular values of $\tau$ and $\eta$. In practice one would evaluate this
expression at a grid of possible values of $\tau$ and $\eta$ in order to
approximate their posterior marginal density, subsequently obtaining
the mean and variance if desired. The approximation has the effect
of replacing the integration in (2.3) with conditional
maximizations, one for each point in the grid.

If item population parameters are not of interest, they can
also be integrated out to yield

$$p(\xi, \tau \mid D, U) \propto \int_\eta \int_\theta p(\theta, \xi, \tau, \eta \mid D, U)\, d\theta\, d\eta . \tag{2.4}$$

The remaining item parameters and examinee population parameters are
typically of primary interest in the educational setting, although
for many examinees and all but very short tests, their marginal
modes under (2.2) and (2.4) will differ little.
An EM Algorithm for Parameter Estimation

This section provides a framework for parameter estimation in the
general model outlined above, based on a variation of Dempster, Laird,
and Rubin's EM algorithm introduced by Bock and Aitkin (1981) in the
context of marginal maximum likelihood (MML) estimation of item
parameters. The posterior density function in our model, marginalized
with respect to $\theta$, can be written as

$$p(\xi, \tau, \eta \mid D, U) \propto \left\{ \int_\theta L(U \mid D, \theta, \xi)\, p(\theta \mid \tau)\, d\theta \right\} \cdot \left\{ p(\tau)\, p(\xi \mid \eta)\, p(\eta) \right\} . \tag{3.1}$$

The first bracketed expression on the right takes the form of the
marginal likelihood of observed responses from a random sample of
examinees from a population with density $p(\theta \mid \tau)$, while the second
can be thought of as the prior distribution for $\xi$ and $\tau$. We now
focus our attention on the first term.

By maximizing the first term of (3.1) with respect to
parameters of interest, Bock and Aitkin (1981) obtain MML estimates
of $\xi$ given $p(\theta \mid \tau)$ and Mislevy (1984) obtains MML estimates of $\tau$
given $L(U \mid D, \theta, \xi)$. Both presentations employed the expedient of
approximating integration over $\theta$ by summation over a finite grid of
points $X_q$, $q = 1, \ldots, Q$, with associated weights $A(X_q \mid \tau)$, as follows:

$$\log L(U \mid D, \xi, \tau) = \sum_i \log \int L(u_i \mid d_i, \theta, \xi)\, p(\theta \mid \tau)\, d\theta \approx \sum_i \log \sum_q L(u_i \mid d_i, X_q, \xi)\, A(X_q \mid \tau) . \tag{3.2}$$

Three methods were suggested for specifying points and weights.
First, when $p(\theta \mid \tau)$ takes the form of a normal density or a mixture
of normal densities, optimal points and weights for a given $Q$ may be
found in Stroud and Sechrest (1966). Second, a Monte Carlo approach
generates a random sample of equally-weighted points from $p(\theta \mid \tau)$.
Third, a grid of $Q$ equally-spaced points can be specified a priori
and assigned weights proportional to $p(X_q \mid \tau)$.
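
The three ways of choosing points and weights can be sketched in code. The following is a minimal illustration, assuming a single normal density for $p(\theta \mid \tau)$; the function names and default ranges are hypothetical, and NumPy's Gauss-Hermite rule stands in for tabulated optimal points and weights rather than reproducing any particular table.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gauss_hermite_grid(Q, mean=0.0, sd=1.0):
    """Optimal points and weights for a normal density (Gauss-Hermite rule)."""
    nodes, weights = hermegauss(Q)          # weight function exp(-x^2 / 2)
    return mean + sd * nodes, weights / weights.sum()


def monte_carlo_grid(Q, mean=0.0, sd=1.0, seed=0):
    """Random sample of equally weighted points drawn from the density."""
    rng = np.random.default_rng(seed)
    return rng.normal(mean, sd, size=Q), np.full(Q, 1.0 / Q)


def equally_spaced_grid(Q, lo=-4.0, hi=4.0, mean=0.0, sd=1.0):
    """Fixed grid with weights proportional to the density at each point."""
    X = np.linspace(lo, hi, Q)
    dens = np.exp(-0.5 * ((X - mean) / sd) ** 2)
    return X, dens / dens.sum()


if __name__ == "__main__":
    for make in (gauss_hermite_grid, monte_carlo_grid, equally_spaced_grid):
        X, A = make(21)
        print(make.__name__, "mean:", (A * X).sum(), "variance:", (A * X**2).sum())
```

In each case the weights are normalized to sum to one, so that they can serve directly as the $A(X_q \mid \tau)$ of the discrete approximation.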

Bock and Aitkin (1981) show that with the discrete
approximation of the likelihood function, partial derivatives of the
marginal likelihood, in which $\theta$'s are not observed but must be
inferred from item responses, can be written in forms quite similar
to their counterparts in a related "complete data" problem in which
individual $\theta$'s are known. Under the assumption of iid $\theta$'s, we may
write the partial derivative of the complete data log likelihood,
namely

$$\log L(U \mid D, \theta, \xi, \tau) = \log L(U \mid D, \theta, \xi) + \log p(\theta \mid \tau) , \tag{3.3}$$

with respect to a typical parameter $v$ from $\xi$ or $\tau$ in the form

$$\frac{\partial \log L(U \mid D, \theta, \xi, \tau)}{\partial v} = \sum_i f_v(r_i, N_i, \theta_i, \xi, \tau) \tag{3.4}$$

for an appropriately defined gradient function $f_v$, where $N_{ij}$, the
$j$th element of $N_i$, is the number of attempts to item $j$ by examinee $i$
and $r_{ij}$, the $j$th element of $r_i$, is the number of those that are
correct. It can be shown (e.g., Mislevy, 1984) that
the corresponding derivative of the marginal log likelihood (3.2)
can then be approximated as



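$$\frac{\partial \log L(U \mid D, \xi, \tau)}{\partial v} \approx \sum_q f_v(\bar r_q, \bar N_q, X_q, \xi, \tau) , \tag{3.5}$$

where

$$\bar N_{qj} = \sum_i N_{ij}\, P(X_q \mid u_i, d_i, \xi, \tau) \tag{3.6}$$

and

$$\bar r_{qj} = \sum_i r_{ij}\, P(X_q \mid u_i, d_i, \xi, \tau) , \tag{3.7}$$

with

$$P(X_q \mid u_i, d_i, \xi, \tau) = \frac{L(u_i \mid d_i, X_q, \xi)\, A(X_q \mid \tau)}{\sum_s L(u_i \mid d_i, X_s, \xi)\, A(X_s \mid \tau)} . \tag{3.8}$$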



An application of Bayes theorem will be recognized in (3.8), yielding
a value approximately proportional to the posterior density of $\theta$
given $u_i$, $d_i$, $\xi$, and $\tau$. The upshot is that the first derivatives
(3.5) of the marginal likelihood are identical in form to the first
derivatives (3.4) of the complete data likelihood, with expressions
for subjects evaluated at $\theta_i$ with observed data $r_{ij}$ and $N_{ij}$ replaced
by similar expressions evaluated at quadrature points with
pseudo-data $\bar r_{qj}$ and $\bar N_{qj}$. Likelihood equations are obtained by
setting the partial derivatives (3.5) to zero.

It will be noted that $\bar r_q$ and $\bar N_q$ depend on $\xi$ and $\tau$. Solution
must proceed iteratively in EM cycles, which, with integration
approximated by summation, take the form described by Dempster et al.
(1977, Section 4.1.1) for missing values under multinomial sampling.
In the E-step, (3.6) and (3.7) are evaluated with provisional
estimates $\hat\xi^{t}$ and $\hat\tau^{t}$. This gives the expectations of $\bar r_q$ and $\bar N_q$
conditional on the data and the provisional parameter estimates.
In the M-step, $\hat\xi^{t+1}$ and $\hat\tau^{t+1}$ are obtained by solving (3.5)
with $\bar r$ and $\bar N$ treated as known. Cycles continue in this manner
until changes become negligible. An indication of the precision
of estimation is given by the following approximation of the Fisher
information matrix:

$$H = \sum_i \left[ \frac{\partial \log L(u_i \mid d_i, \xi, \tau)}{\partial (\xi, \tau)} \right] \left[ \frac{\partial \log L(u_i \mid d_i, \xi, \tau)}{\partial (\xi, \tau)'} \right] , \tag{3.9}$$

evaluated at $(\hat\xi, \hat\tau)$.
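
The E-step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' program: it assumes dichotomous items (so that $N_{ij} = d_{ij}$ and $r_{ij} = u_{ij}$), a three-parameter logistic response function with $D = 1.7$, and hypothetical names (`p_correct`, `e_step`, `Dmat` for the administration matrix $D$).

```python
import numpy as np


def p_correct(X, a, b, c, scale=1.7):
    """Probability of a correct response at each grid point (rows) and item (columns)."""
    z = scale * a[None, :] * (X[:, None] - b[None, :])
    return c[None, :] + (1.0 - c[None, :]) / (1.0 + np.exp(-z))


def e_step(U, Dmat, X, A, a, b, c):
    """Posterior weights (3.8) and pseudo-data (3.6)-(3.7) for provisional item parameters."""
    P = p_correct(X, a, b, c)                          # shape (Q, n)
    # log L(u_i | d_i, X_q, xi); items that were not administered contribute nothing
    loglik = (Dmat * U) @ np.log(P).T + (Dmat * (1.0 - U)) @ np.log1p(-P).T
    joint = np.exp(loglik) * A[None, :]                # shape (N, Q)
    post = joint / joint.sum(axis=1, keepdims=True)    # P(X_q | u_i, d_i, xi, tau)
    N_bar = post.T @ Dmat                              # expected attempts, shape (Q, n)
    r_bar = post.T @ (Dmat * U)                        # expected correct responses, shape (Q, n)
    return post, N_bar, r_bar
```

Summing `N_bar` over grid points recovers the number of examinees administered each item, and summing `r_bar` recovers the number answering it correctly, which is a convenient check on an implementation.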

The EM algorithm is readily extended to Bayes modal estimation
(Dempster et al., 1977, p. 6). All of the foregoing procedures are
applied as before, except that the marginal likelihood equations
(3.5) are replaced by so-called "Lindley equations"; for a typical
element $v$ of $\xi$ or $\tau$,

$$0 = \frac{\partial \log p(\xi, \tau \mid D, U)}{\partial v} = \frac{\partial \log L(U \mid D, \xi, \tau)}{\partial v} + \frac{\partial \log p(\xi \mid \eta)}{\partial v} + \frac{\partial \log p(\tau)}{\partial v} . \tag{3.10}$$

The treatment of item population parameters $\eta$, which do not appear
in (3.5), depends on whether they are to be integrated out or jointly
estimated. Integrating them out modifies the form of the prior for
$\xi$ from $p(\xi \mid \eta)$ to $\int p(\xi \mid \eta)\, p(\eta)\, d\eta$. Estimating them requires the
solution of additional equations

$$0 = \frac{\partial \log p(\xi \mid \eta)\, p(\eta)}{\partial \eta} . \tag{3.11}$$


Under regularity conditions, posterior densities in Bayes
estimation tend to multivariate normality as sample size increases.
Asymptotically, the mean is equal to the mode, which is equal to
the maximum likelihood estimate. The precision matrix, or the
inverse of the covariance matrix, is given by the negative matrix
of second derivatives of the log posterior, evaluated at that point.

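When $\eta$ has been integrated out in the problem at hand, this
matrix takes the form

$$-\frac{\partial^2 \log L(U \mid D, \xi, \tau)}{\partial (\xi, \tau)\, \partial (\xi, \tau)'} - \frac{\partial^2 \log p(\xi)\, p(\tau)}{\partial (\xi, \tau)\, \partial (\xi, \tau)'} , \tag{3.12}$$

where

$$p(\xi) = \int p(\xi \mid \eta)\, p(\eta)\, d\eta .$$

Employing the well-known result on Fisher's information matrix

$$E\left[ -\frac{\partial^2 \log p(\text{data} \mid x)}{\partial x\, \partial x'} \right] = E\left[ \left( \frac{\partial \log p(\text{data} \mid x)}{\partial x} \right) \left( \frac{\partial \log p(\text{data} \mid x)}{\partial x'} \right) \right]$$

(e.g., Kendall & Stuart, 1973, pp. 8-10) and substituting observed values
for expectations, we avoid calculating second derivatives of the
log likelihood via the approximation

$$H - \frac{\partial^2 \log p(\xi)\, p(\tau)}{\partial (\xi, \tau)\, \partial (\xi, \tau)'} , \tag{3.13}$$

where $H$ is given in (3.9). When $\eta$ is estimated jointly with $\xi$ and
$\tau$, the precision matrix is similarly approximated as

$$\begin{bmatrix} H - \dfrac{\partial^2 \log p(\tau)\, p(\xi \mid \eta)}{\partial (\xi, \tau)\, \partial (\xi, \tau)'} & -\dfrac{\partial^2 \log p(\tau)\, p(\xi \mid \eta)\, p(\eta)}{\partial (\xi, \tau)\, \partial \eta'} \\[2ex] \text{(symmetric)} & -\dfrac{\partial^2 \log p(\tau)\, p(\xi \mid \eta)\, p(\eta)}{\partial \eta\, \partial \eta'} \end{bmatrix} . \tag{3.14}$$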

It should be pointed out that solutions of the Lindley equations
are local extrema or saddle points of the posterior. Whether they
are local maxima can be determined by examining the shape of the
posterior in the neighborhoods of solutions, either empirically or
through the matrix of second derivatives, which will be negative
definite at local maxima. Whether a local maximum is a global
maximum follows in certain cases from the form of the posterior
(e.g., a member of the exponential family), but must be determined
empirically in most cases by starting the iterative solution from
a number of different initial values.

Procedures for Some Logistic Models 

The balance of the article implements the procedures in the 
context of logistic item response models. The following sections 
provide details on functional forms for the likelihood and prior 
distributions, and on the corresponding forms of the fitting 
equations. For the first stage of priors, a multivariate normal 
density will be posited for item thresholds, log slopes, and logit 
asymptotes; both a mixture of normal components and a nonparametric 
approximation in the form of a histogram will be provided for 
examinee abilities. For the second stage, both diffuse and natural 
conjugate priors will be provided in all cases.

The Likelihood Function

The three-parameter logistic model for dichotomous items
(Birnbaum, 1968) gives the probability of a correct response to
item $j$ from examinee $i$ as

$$P_j(\theta_i) = P(u_{ij} = 1 \mid \theta_i, a_j, b_j, c_j) = c_j + (1 - c_j)\, \Psi[D a_j (\theta_i - b_j)] , \tag{4.1}$$

where $\Psi(x)$ is the logistic function $1/(1 + \exp(-x))$. $D$ is a scaling
constant, taken as 1 by some writers for convenience and as 1.7 by
others (e.g., Birnbaum, 1968) so that the units of the model will
approximate those of normal ogive IRT models (Lord, 1952). One may
obtain the two-parameter logistic model from (4.1) by fixing $c_j = 0$,
and the one-parameter model (Rasch, 1960) by additionally fixing
$a_j = 1$.
Indeterminacies of scale and origin are apparent in (4.1). If
for any scalars $m$ and $x$ we define $\theta^* = m\theta + x$, $b^* = mb + x$, and
$a^* = a/m$, then $P(u = 1 \mid \theta^*, a^*, b^*, c) = P(u = 1 \mid \theta, a, b, c)$. In this
article we will specify higher-level prior distributions that
resolve these indeterminacies.
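
The indeterminacy is easy to confirm numerically. The sketch below evaluates the response function (4.1) before and after a linear change of scale; the function names are hypothetical, $D = 1.7$ is assumed, and $m$ is taken to be positive so that the rescaled slope stays positive.

```python
import numpy as np


def p3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response, as in (4.1)."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))


def rescale(theta, a, b, m, x):
    """Apply theta* = m*theta + x, a* = a/m, b* = m*b + x."""
    return m * theta + x, a / m, m * b + x


if __name__ == "__main__":
    theta = np.linspace(-3.0, 3.0, 13)
    a, b, c = 1.2, 0.5, 0.2
    theta_s, a_s, b_s = rescale(theta, a, b, m=2.5, x=-0.7)
    # The two sets of parameters trace the same response probabilities.
    print(np.allclose(p3pl(theta, a, b, c), p3pl(theta_s, a_s, b_s, c)))
```

Because the data cannot distinguish the two parameterizations, some constraint on location and scale has to be supplied from outside the likelihood.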

Rather than obtaining a posterior for $a$, $b$, and $c$ directly, we
work with the transformed item parameters

$$\alpha_j = \log a_j ,$$

$$\beta_j = b_j ,$$

and

$$\gamma_j = \log\left( c_j / (1 - c_j) \right) .$$

It is readily inferred that $a_j = \exp(\alpha_j)$ and $c_j = 1/(1 + \exp(-\gamma_j))$. While this
formulation does not permit the boundary values of 0 and 1 for $c_j$,
it serves our purposes adequately by allowing $c$'s arbitrarily close
to these values. Non-positive $a$'s are also disallowed; careful
examination of fitted and empirical response curves will obviously
be required in applications where faulty items and incorrect keys
can occur.

Reparameterization achieves two ends. The first is a more
rapid attainment of large-sample results. The impediment against
normality represented by the finite range of $c$, for example, is
removed by re-expression in terms of $\gamma$. The second is convenience
in specifying higher level prior densities. With unrestricted
ranges for all parameters, the imposition of multivariate normal
priors on parameters within items but independent across items is
not unreasonable. This may be the simplest way to allow for the
possibility of dependence among parameters $a$, $b$, and $c$ in a
population of items.
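
A short sketch of the reparameterization follows, with hypothetical function names. It maps slopes to logs and lower asymptotes to logits and back, which is all that is needed to move between the bounded parameters $(a, b, c)$ and the unrestricted ones on which a multivariate normal prior can be placed.

```python
import numpy as np


def to_unbounded(a, b, c):
    """(a, b, c) -> (log slope, threshold, logit asymptote)."""
    return np.log(a), b, np.log(c / (1.0 - c))


def to_natural(alpha, beta, gamma):
    """Inverse map: slopes are positive and asymptotes lie strictly inside (0, 1)."""
    return np.exp(alpha), beta, 1.0 / (1.0 + np.exp(-gamma))


if __name__ == "__main__":
    a = np.array([0.5, 2.0])
    b = np.array([-1.0, 1.0])
    c = np.array([0.10, 0.25])
    print(to_natural(*to_unbounded(a, b, c)))   # recovers a, b, c
```

Any real triple on the transformed scale corresponds to an admissible item, so no boundary checks are needed during iterative estimation.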

Letting $\xi$ represent $(\alpha_1, \beta_1, \gamma_1, \ldots, \alpha_n, \beta_n, \gamma_n)$, the Lindley
equations for item parameters take the form

$$0 = \frac{\partial \log L(\xi, \tau)}{\partial \xi} + \frac{\partial \log p(\xi \mid \eta)}{\partial \xi} . \tag{4.2}$$

Formulas for the second term appear in the following section.
Those for the first term are approximated as
(4.3)

where $\bar N_{qj}$ and $\bar r_{qj}$ are given in (3.6) and (3.7) and
With 

and 

Given $\bar r$ and $\bar N$, the equations (4.2) corresponding to
parameters of different items are independent. This means that
the M-step task of finding zeros of (4.2), along with additional
Lindley equations for examinee- and possibly item-population
parameters, need not address all $3n$ equations for item parameters
simultaneously. Zeros for the parameters of a given item may be
obtained rapidly by methods such as Newton-Raphson iterations, which
require second derivatives of the log posterior, or Davidon-
Fletcher-Powell iterations, which do not.
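
For readers implementing the M-step, the likelihood part of the item equations can be obtained by the chain rule. The sketch below is not taken from the article: it assumes the pseudo-data log likelihood $\sum_q [\bar r_{qj} \log P_j(X_q) + (\bar N_{qj} - \bar r_{qj}) \log(1 - P_j(X_q))]$ for a single item under (4.1), with $D = 1.7$ and the transformed parameters $(\alpha_j, \beta_j, \gamma_j)$; all function names are hypothetical.

```python
import numpy as np

D = 1.7  # assumed scaling constant


def p3pl_parts(X, alpha, beta, gamma):
    """Return P_j(X_q), the logistic part Psi, and the asymptote c for one item."""
    psi = 1.0 / (1.0 + np.exp(-D * np.exp(alpha) * (X - beta)))
    c = 1.0 / (1.0 + np.exp(-gamma))
    return c + (1.0 - c) * psi, psi, c


def pseudo_loglik(params, X, r_bar, N_bar):
    """Pseudo-data log likelihood for one item, with r_bar and N_bar treated as known."""
    P, _, _ = p3pl_parts(X, *params)
    return np.sum(r_bar * np.log(P) + (N_bar - r_bar) * np.log(1.0 - P))


def pseudo_grad(params, X, r_bar, N_bar):
    """Gradient of pseudo_loglik with respect to (alpha, beta, gamma)."""
    alpha, beta, gamma = params
    a = np.exp(alpha)
    P, psi, c = p3pl_parts(X, alpha, beta, gamma)
    resid = (r_bar - N_bar * P) / (P * (1.0 - P))     # d loglik / dP at each grid point
    core = (1.0 - c) * psi * (1.0 - psi)
    dP_dalpha = core * D * a * (X - beta)
    dP_dbeta = -core * D * a
    dP_dgamma = c * (1.0 - c) * (1.0 - psi)
    return np.array([np.sum(resid * dP_dalpha),
                     np.sum(resid * dP_dbeta),
                     np.sum(resid * dP_dgamma)])
```

Adding the prior contribution for the same item to `pseudo_grad` and setting the sum to zero gives that item's three Lindley equations; comparing the gradient with finite differences of `pseudo_loglik` is a cheap safeguard before handing it to a Newton or quasi-Newton routine.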

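Structures on Item Parameters

Let the prior distribution on the parameters for item $j$ be
given by $\xi_j \sim \mathrm{MVN}(\mu_\xi, \Sigma_\xi)$, where $\mu_\xi = (\mu_\alpha, \mu_\beta, \mu_\gamma)$.
Hence $(\mu_\xi, \Sigma_\xi)$ plays the role of the item population parameter $\eta$
in the more general notation of the preceding section. Assuming
independence over items, the joint prior density of item parameters
is then given by

$$p(\xi \mid \mu_\xi, \Sigma_\xi) \propto |\Sigma_\xi|^{-n/2} \prod_j \exp\left\{ -\tfrac{1}{2} (\xi_j - \mu_\xi)' \Sigma_\xi^{-1} (\xi_j - \mu_\xi) \right\} \tag{5.1}$$

and the log prior density by

$$\log p(\xi \mid \mu_\xi, \Sigma_\xi) = -\frac{n}{2} \log |\Sigma_\xi| - \frac{1}{2} \sum_j (\xi_j - \mu_\xi)' \Sigma_\xi^{-1} (\xi_j - \mu_\xi) + \text{constant} . \tag{5.2}$$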
The partial derivatives of (5.2) with respect to the parameters for
item $j$ are obtained as

$$\frac{\partial \log p(\xi \mid \mu_\xi, \Sigma_\xi)}{\partial \xi_j} = -\Sigma_\xi^{-1} \begin{bmatrix} \alpha_j - \mu_\alpha \\ \beta_j - \mu_\beta \\ \gamma_j - \mu_\gamma \end{bmatrix} . \tag{5.3}$$

These terms are added to the partial derivatives of the log
likelihood (4.3)-(4.5) and the results set to zero to give the
Lindley equations for the parameters of item $j$.

In IRT models with independent unimodal prior distributions
on item parameters, the contribution of prior information in the
Lindley equation for a given parameter depends upon its distance
from the center of the distribution of parameters of its same type.
That is, parameters of a given type "shrink" toward a single point,
namely the mean of parameters of that type, by amounts inversely
proportional to the information available about each individually.






Bayes Modal Estimation 

It will be seen in (5.3) that under the structure proposed here,
the contribution of the prior also depends on the distance of the 
item's parameters of other types from the centers of their 
respective populations. Parameters of a given type now shrink 
toward a plane, namely their conditional expectations given the 
values of the items' parameters of other types. 
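
The shrinkage toward a conditional expectation can be seen directly from the prior gradient. In the sketch below (hypothetical names and illustrative numbers), the prior contribution for one item is $-\Sigma_\xi^{-1}(\xi_j - \mu_\xi)$; its first element vanishes not when $\alpha_j$ equals $\mu_\alpha$ but when $\alpha_j$ equals its conditional expectation given $\beta_j$ and $\gamma_j$.

```python
import numpy as np


def prior_gradient(xi_j, mu, Sigma):
    """Derivative of the multivariate normal log prior with respect to one item's parameters."""
    return -np.linalg.solve(Sigma, xi_j - mu)


def conditional_mean_first(xi_j, mu, Sigma):
    """E[alpha_j | beta_j, gamma_j] under the multivariate normal prior."""
    rest = xi_j[1:] - mu[1:]
    return mu[0] + Sigma[0, 1:] @ np.linalg.solve(Sigma[1:, 1:], rest)


if __name__ == "__main__":
    mu = np.array([0.0, 0.0, -1.4])                  # illustrative population means
    Sigma = np.array([[0.25, 0.10, 0.05],
                      [0.10, 1.00, 0.20],
                      [0.05, 0.20, 0.50]])           # illustrative covariance matrix
    xi_j = np.array([0.8, 1.5, -1.0])
    print("pull on alpha at its current value:", prior_gradient(xi_j, mu, Sigma)[0])
    xi_j[0] = conditional_mean_first(xi_j, mu, Sigma)
    print("pull on alpha at its conditional mean:", prior_gradient(xi_j, mu, Sigma)[0])
```

With a diagonal covariance matrix the conditional mean reduces to $\mu_\alpha$, which recovers the independent-prior case described in the preceding paragraph.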

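Let us suppose further that $(\mu_\xi, \Sigma_\xi)$ follows the natural
conjugate prior distribution for the multivariate normal, namely
multivariate normal for $\mu_\xi$ given $\Sigma_\xi$ and inverted Wishart for
$\Sigma_\xi$ (Ando and Kaufman, 1965):

$$p(\mu_\xi, \Sigma_\xi) \propto |\Sigma_\xi^{-1}|^{(m+1)/2} \exp\left\{ -\tfrac{1}{2} \left[ b (\mu_\xi - \mu_0)' \Sigma_\xi^{-1} (\mu_\xi - \mu_0) + \mathrm{tr}\, \Sigma_\xi^{-1} H \right] \right\} , \tag{5.4}$$

whence

$$\log p(\mu_\xi, \Sigma_\xi) = -\frac{m + 1}{2} \log |\Sigma_\xi| - \frac{1}{2} \left[ b (\mu_\xi - \mu_0)' \Sigma_\xi^{-1} (\mu_\xi - \mu_0) + \mathrm{tr}\, \Sigma_\xi^{-1} H \right] + \text{constant} . \tag{5.5}$$
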
Here $b$ and $m$ are scalars ($m > 2p$ for a proper distribution under
the $p$-parameter IRT model), $\mu_0$ is a vector, and $H$ is a 3-by-3
positive symmetric matrix, all to be specified in such a way that
$H$ corresponds to the covariance of $m - p$ values of $\xi$ and $\mu_0$
corresponds to the average of $b$ values of $\xi$.

The indeterminacies of scale and origin in the two- and three-
parameter models can be conveniently resolved at this point by
specifying that $p(\mu_\xi, \Sigma_\xi)$ is null everywhere except where $\mu_\alpha = 0$
and $\mu_\beta = 0$. Only the latter constraint enters into the one-
parameter model.

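If $\mu_\xi$ and $\Sigma_\xi$ are to be estimated jointly with $\xi$ and $\tau$, partial
derivatives must first be obtained for all terms in the log
posterior in which they appear, namely $\log p(\xi \mid \mu_\xi, \Sigma_\xi)$ (5.2) and
$\log p(\mu_\xi, \Sigma_\xi)$ (5.5):

$$\frac{\partial \log p(\xi \mid \mu_\xi, \Sigma_\xi)}{\partial \mu_\xi} + \frac{\partial \log p(\mu_\xi, \Sigma_\xi)}{\partial \mu_\xi} = \Sigma_\xi^{-1} \left\{ n (\bar\xi - \mu_\xi) - b (\mu_\xi - \mu_0) \right\} \tag{5.6}$$

and

$$\frac{\partial \log p(\xi \mid \mu_\xi, \Sigma_\xi)}{\partial \Sigma_\xi} + \frac{\partial \log p(\mu_\xi, \Sigma_\xi)}{\partial \Sigma_\xi} = \tfrac{1}{2} \Sigma_\xi^{-1} \left\{ -(n + m + 1) \Sigma_\xi + S + n (\bar\xi - \mu_\xi)(\bar\xi - \mu_\xi)' + b (\mu_\xi - \mu_0)(\mu_\xi - \mu_0)' + H \right\} \Sigma_\xi^{-1} , \tag{5.7}$$

where

$$\bar\xi = n^{-1} \sum_j \xi_j \tag{5.8}$$

and

$$S = \sum_j (\xi_j - \bar\xi)(\xi_j - \bar\xi)' . \tag{5.9}$$

Equating to zero and simplifying yields the Lindley equations

$$\hat\mu_\xi = \frac{n \bar\xi + b \mu_0}{n + b} \tag{5.10}$$

and

$$\hat\Sigma_\xi = (n + m + 1)^{-1} \left\{ S + n (\bar\xi - \hat\mu_\xi)(\bar\xi - \hat\mu_\xi)' + b (\hat\mu_\xi - \mu_0)(\hat\mu_\xi - \mu_0)' + H \right\} . \tag{5.11}$$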

A familiar theme in Bayesian estimation appears in (5.10), where a
mean is estimated as a weighted average of a sample mean and a prior
mean. It should be pointed out that $\bar\xi$ in (5.8) will generally not
be equal to the simple mean of the item parameter estimates that
would have been obtained under joint maximum likelihood (JML)
estimation. This is because the item parameters are being
estimated at the same time, and each is shrinking back from its JML
value in inverse proportion to the amount of information about it;
items therefore contribute toward the estimation of the item
population mean in direct proportion to the amount of information
about them.
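
Given provisional item parameters, the update for the item population parameters is available in closed form. The sketch below assumes the normal-inverted Wishart structure of (5.4), with the prior mean written as `mu0` (a label chosen here for illustration), and solves for the values at which the derivatives with respect to $\mu_\xi$ and $\Sigma_\xi$ vanish; the function name is hypothetical.

```python
import numpy as np


def lindley_item_population(xi, mu0, b, H, m):
    """Posterior-mode update of (mu, Sigma) given an (n, 3) array of item parameters."""
    n = xi.shape[0]
    xi_bar = xi.mean(axis=0)                       # sample mean of item parameters
    dev = xi - xi_bar
    S = dev.T @ dev                                # sums of squares and cross products
    mu_hat = (n * xi_bar + b * mu0) / (n + b)      # weighted average of sample and prior means
    d_sample = xi_bar - mu_hat
    d_prior = mu_hat - mu0
    Sigma_hat = (S + n * np.outer(d_sample, d_sample)
                 + b * np.outer(d_prior, d_prior) + H) / (n + m + 1)
    return mu_hat, Sigma_hat
```

As `b` grows the estimated mean is pulled toward `mu0`, and as `b` approaches zero it approaches the sample mean of the current item parameter estimates.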
To emulate maximum likelihood estimation of $\mu_\xi$ and $\Sigma_\xi$, again
jointly with $\xi$ and $\tau$, one may specify that $H = 0$ and $m = 2$, and
omit the quadratic term involving $\mu_\xi$ in and after (5.4). This gives
an improper diffuse prior, justifiable along the lines of invariance
with respect to reparameterization (Jeffreys, 1961). The partial
derivatives and Lindley equations simplify in obvious ways.

If modal values of $\xi$ and $\tau$ marginalized with respect to $\mu_\xi$ and
$\Sigma_\xi$ are desired, these latter parameters may be integrated out
and the Lindley equations for item parameters modified in the
following manner. Focusing on the relevant terms of the posterior,
we can write

$$p(\xi \mid \mu_\xi, \Sigma_\xi)\, p(\mu_\xi, \Sigma_\xi) \propto |\Sigma_\xi|^{-(n+m+1)/2} \exp\left\{ -\tfrac{1}{2} \mathrm{tr}\, \Sigma_\xi^{-1} \left[ S + H + n (\bar\xi - \mu_\xi)(\bar\xi - \mu_\xi)' + b (\mu_\xi - \mu_0)(\mu_\xi - \mu_0)' \right] \right\} . \tag{5.12}$$

Integration over $\Sigma_\xi$ yields a multivariate-t distribution for $\mu_\xi$
(Ando and Kaufman, 1965):
where



and 



b + n 



S + H 

" " ^ nb 



By using the constant of integration for the multivariate-t, we
obtain for the marginal distribution of $\xi$ the following quantity:



p(0 « !c!^^2 



The terms to be added to the partial derivative of the log marginal
likelihood to obtain a Lindley equation for $\xi_j$, now marginalized with
respect to $\mu_\xi$ and $\Sigma_\xi$, become
3 log p(5) C"^ C - y



This result is similar in form to (5.3), the contribution when $\mu_\xi$ and
$\Sigma_\xi$ are estimated jointly with $\xi$.

Structures on Examinee Parameters

This section presents details for two types of prior
distributions on examinee parameters $\theta$, namely a nonparametric prior
in the form of a histogram and a mixture of homoscedastic normal
distributions in unknown proportions. The latter choice includes
the familiar standard normal prior as a special case.

Recalling the form of the posterior distribution for $\xi$, $\eta$, and
$\tau$, or

$$p(\xi, \tau, \eta \mid D, U) \propto \left\{ \int_\theta L(U \mid D, \theta, \xi)\, p(\theta \mid \tau)\, d\theta \right\} \cdot \{ p(\tau) \} \cdot \{ p(\xi \mid \eta)\, p(\eta) \} , \tag{6.1}$$

we note that (1) contributions to the Lindley equations for $\tau$ come
from the marginal likelihood and from its prior and (2) these
contributions are the same regardless of whether $\eta$ is being
estimated jointly or integrated out. Both partial derivatives and
Lindley equations for $\tau$ are presented here, the former because they
are needed to approximate the information matrix and the latter
because the partial derivatives often simplify after being equated to
zero. Detailed calculations of the contributions from the marginal
likelihood are omitted, as they may be found in Mislevy (1984).
A nonparametric solution. If $p(\theta \mid \tau)$ is a smooth continuous density,
it may be approximated by a discrete distribution over a finite
number of points $X_q$, $q = 1, \ldots, Q$. Letting $p_q$ denote the density
at point $X_q$, we approximate the log marginal likelihood as

$$\log L(U \mid D, \xi, \tau) \approx \sum_{i=1}^{N} \log h(u_i) , \tag{6.2}$$

where

$$h(u_i) = \sum_{q=1}^{Q} L(u_i \mid d_i, X_q, \xi)\, p_q .$$

The continuous density $p(\theta \mid \tau)$ is thus replaced by a multinomial
distribution with parameters $p_1, \ldots, p_{Q-1}$, with

$$p_Q = 1 - \sum_{q=1}^{Q-1} p_q .$$

It can be shown that the partial derivative of (6.2) with respect
to $p_q$ is

$$\frac{\partial \log L}{\partial p_q} = \sum_i h^{-1}(u_i) \left[ L(u_i \mid d_i, X_q, \xi) - L(u_i \mid d_i, X_Q, \xi) \right] .$$

The natural conjugate prior for the multinomial is the
Dirichlet distribution, which takes the following form:

$$p(p_1, \ldots, p_{Q-1} \mid M_1, \ldots, M_Q) \propto \prod_{q=1}^{Q} p_q^{M_q - 1} ,$$

which implies that

$$\frac{\partial \log p(p \mid M)}{\partial p_q} = \frac{M_q - 1}{p_q} - \frac{M_Q - 1}{p_Q} , \qquad q = 1, \ldots, Q - 1 .$$

Prior beliefs about $p_1, \ldots, p_Q$ are thus expressed as values of the
proportions $(M_1 - 1)/M^+, \ldots, (M_Q - 1)/M^+$, where $M^+ = \sum_q M_q - Q$.
The forms given above provide first derivatives that lead to
a positive definite matrix of second derivatives, and are thereby
useful in estimating parameters by Newton or quasi-Newton algorithms
and in computing posterior variances. Simpler and more intuitively
appealing Lindley equations result, however, for all $Q$ $p$'s with
their sum restricted to unity:

$$\hat p_q = \frac{\bar N_q + (M_q - 1)}{N + M^+} , \qquad q = 1, \ldots, Q , \tag{6.4}$$

where

$$\bar N_q = \sum_i \frac{L(u_i \mid d_i, X_q, \xi)\, p_q}{\sum_r L(u_i \mid d_i, X_r, \xi)\, p_r} .$$

The posterior density at point $X_q$, therefore, is a weighted average
of its prior density and the expectation of its density conditional
on the data and the densities themselves.
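
One cycle of the resulting update for the histogram weights can be sketched as follows. The code assumes that the likelihood of each response pattern at each grid point has already been computed and stored in an $N$-by-$Q$ array; the function name and argument layout are hypothetical.

```python
import numpy as np


def update_weights(like, p, M):
    """One pass of the Lindley equations (6.4) for histogram weights.

    like : (N, Q) array of L(u_i | d_i, X_q, xi)
    p    : (Q,) current weights, summing to one
    M    : (Q,) Dirichlet prior parameters
    """
    joint = like * p[None, :]
    post = joint / joint.sum(axis=1, keepdims=True)   # posterior weight of each point per examinee
    N_bar = post.sum(axis=0)                          # expected number of examinees at each point
    N = like.shape[0]
    M_plus = M.sum() - M.size                         # prior sample size, sum over q of (M_q - 1)
    return (N_bar + (M - 1.0)) / (N + M_plus)
```

Because `N_bar` depends on the current weights, the update is applied repeatedly within the EM cycles until the weights stabilize. With every `M` equal to one the prior terms drop out and the update is the average of the examinees' posterior weights.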

To obtain maximum likelihood estimates, we may take a uniform 
diffuse prior with $M_q = 1$. An alternative diffuse prior with
i -1 may be preferred, however, on the grounds of robustness with
respect to the choice of quadrature points (Novick & Jackson, 1974,
p. 347 ff.). 

It is possible to resolve the indeterminacies of the IRT model
at this point, by specifying that the distribution $p(p \mid M)$ can take
nonzero values only when the following equality constraints are
satisfied:

$$\sum_q X_q p_q = 0$$

and

$$\sum_q X_q^2 p_q = 1 .$$

Values of $M$ specified in an informative prior should satisfy these
constraints as well.

A mixture of normal components. Suppose that the distribution
is a mixture of $K$ normal components, with means $\mu = (\mu_1, \ldots, \mu_K)$
and common variance $\sigma^2$. Let $p = (p_1, \ldots, p_K)$ be the unknown
proportions of the mixture. Define the marginal probability of
response pattern $u$ given $\xi$ and $\tau = (\mu, p, \sigma^2)$ as

$$h(u) = \sum_k p_k \int L(u \mid d, \theta, \xi, \tau)\, f_k(\theta)\, d\theta ,$$

where

$$f_k(\theta) = (2\pi\sigma^2)^{-1/2} \exp\left\{ -\frac{(\theta - \mu_k)^2}{2\sigma^2} \right\} .$$

The log marginal likelihood for $N$ examinees is then written as

$$\log L(U \mid D, \xi, \tau) = \sum_i \log h(u_i) . \tag{6.5}$$

Approximating integration by summation over a fixed grid of
equally-spaced quadrature points $X_1, \ldots, X_Q$,

$$\log L(U \mid D, \xi, \tau) \approx \sum_i \log \sum_k p_k \sum_q L(u_i \mid X_q)\, f_k(X_q) ,$$

where

$$L(u_i \mid X_q) = L(u_i \mid d_i, X_q, \xi, \tau) .$$
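Taking $p_1, \ldots, p_{K-1}$ as parameters specifying proportions,
partial derivatives of (6.5) are then obtained as

$$\frac{\partial \log L}{\partial p_k} = p_k^{-1} \sum_q \bar N_{kq} - p_K^{-1} \sum_q \bar N_{Kq} , \tag{6.6}$$

$$\frac{\partial \log L}{\partial \mu_k} = \sigma^{-2} \sum_q \bar N_{kq} (X_q - \mu_k) , \tag{6.7}$$

and

$$\frac{\partial \log L}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_k \sum_q \bar N_{kq} (X_q - \mu_k)^2 , \tag{6.8}$$

where

$$\bar N_{kq} = \sum_i \frac{p_k\, L(u_i \mid X_q)\, f_k(X_q)}{\sum_{k'} p_{k'} \sum_{q'} L(u_i \mid X_{q'})\, f_{k'}(X_{q'})} .$$
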
A natural conjugate prior for $\tau$ is Dirichlet-normal-inverse
gamma:
2 - yfc)

p(p,y,a'')« { n }{ n exp [ — , ^ 1) 
2



io-^-^^*'Kxpi=^} . (6.9) 
20 



whence 



2 



2 "^**k ' ^k^ 
log pCp.y.a") - Z (M. - 1) log p. + Z , 

k * k 20^ 



- (v/2 + 1) log a - (8/20^) . (6.10) 



Here $M$, $\mu_0$, $\nu$, and $s$ are the parameters of the prior distribution,
to be supplied by the user. $M$ can be thought of as the number of
examinees in each of the components from a sample of size
$M^+ = \sum_k (M_k - 1)$; $\mu_0$ can be thought of as anticipated locations for the
means of the components. $\nu$ and $s$ are the parameters of the
inverted gamma distribution, possibly more easily specified after
one has in mind a mean and variance of such a distribution that
incorporates prior belief about $\sigma^2$:



V , 2 mean ^ ^ 
variance 



and 



mean • variance 

2 

2 (mean + variance ) 



The indeterminacies of the IRT model can also be resolved at
this point, by specifying that the total mean and within-component
variance take specified values, say 0 and 1. That is, $p(\tau)$ is zero
except where

$$\sum_k p_k \mu_k = 0$$

and

$$\sigma^2 = 1 .$$

When $K = 1$, a standard normal density is effectively specified for $\theta$
by this procedure.

Lindley equations are now obtained as the sums of partial
derivatives of the log marginal likelihood (6.5) and the log prior
(6.10). Again writing equations in terms of $K$ $p$'s constrained to a
sum of one, we obtain



L R + (M^ - 1) 



k - 1,,...,Q , (6.11) 



N + M 



2 N. X + y. 



(6.12) 



and 



Z Z N. (X 

1 Kq q 



k 

N + (v/2 + 1) 



(6.13) 



A diffuse prior may be obtained from (6.9) by omitting the term
involving $\mu_0$ and setting $M_k = 1$, $s = 0$, and $\nu = 0$. Partial
derivatives and Lindley equations simplify in obvious ways. 
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A Numerical Exanple 

Satisfactory procedures for item parameter estimation have been
available for some time for both large and small samples under the
one-parameter logistic (Rasch) IRT model and for large samples of
both persons and items under the three-parameter logistic (Birnbaum)
IRT model. The same cannot be said about small samples under the
three-parameter model, and it is to this problem we apply the
procedures of the preceding sections.

A perusal of the recent literature on Bayesian item parameter
estimation suggests that such efforts were motivated not so much by
the pursuit of minimum mean squared error or by a conviction that
all unknowns should be expressed in probabilistic terms, but rather
by a more practical desire to obtain "reasonable" item parameter
estimates, in particular, finite ones.

The essential difficulty with parameter estimation under the
three-parameter model is that the parameters of a given item are
often poorly determined by the data at hand; apparently discrepant
triples (a, b, c) can trace similar response curves in the region of
the ability scale where the sample of examinees is to be found.
Such poor resolution is manifest as a likelihood surface nearly flat
along one or more dimensions, yielding unstable maximum likelihood
estimates (MLE's). A trivially higher likelihood may be produced,
for example, by taking a particular item's values of a and c to be
203 and .6 rather than the more reasonable values of 2 and .25.

Extreme and infinite parameter estimates can be avoided by
using a single-stage Bayesian prior, but not without introducing an
additional hazard. A fully-specified prior will indeed have the
desired effect of pulling extreme but ill-determined values toward
the center of the prior distribution. If the prior has been poorly
specified, however, this center may be far from the actual center of
the parameter values of interest; estimates of all such parameters
will be biased in the same direction. These "ensemble biases" have
serious implications for subsequent estimation of examinee individual
or population parameters, for while such estimation is resistant to
random errors in item parameters, it reflects in direct measure
systematic errors in a's and b's, and, through the systematic errors
in a's and b's they imply, systematic errors in c's as well.
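
The poor resolution of (a, b, c) described above can be made concrete with a small sketch. It evaluates the three-parameter logistic curve for two quite different triples and reports the largest gap between the curves, first over an interval where a sample might be concentrated and then over a wider interval. The two triples, the intervals, and the function names are hypothetical choices made for illustration, and the logistic metric without the 1.7 scaling constant is assumed.

```python
import math


def p_3pl(theta, a, b, c):
    """Three-parameter logistic response probability.

    Assumption: logistic metric without the 1.7 scaling constant.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def max_gap(item1, item2, lo, hi, steps=200):
    """Largest absolute difference between two response curves,
    evaluated on an evenly spaced grid over [lo, hi]."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(p_3pl(t, *item1) - p_3pl(t, *item2)) for t in grid)


# Hypothetical triples (a, b, c), chosen only for illustration.
item_x = (1.0, -1.0, 0.05)
item_y = (1.3, -0.6, 0.30)

# Gap where the examinees are assumed to be concentrated ...
print(max_gap(item_x, item_y, lo=0.0, hi=3.0))
# ... and gap when the lower tail of the scale is included.
print(max_gap(item_x, item_y, lo=-4.0, hi=3.0))
```

When few examinees fall in the lower tail, the data can say little about which triple is correct, which is the situation in which a prior on the item parameters has the most influence.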

As a means of overcoming these difficulties, one may introduce
the second-stage prior distributions. Experience suggests that item
responses of small samples of examinees (fewer than 2000, say) to
short tests (fewer than 40 items) provide sufficient information to
approximate the central tendencies of item parameters through their
mean vector, so that its prior may be diffuse, but not to estimate the
covariance matrix, so that its prior must be informative. The
BILOG computer program (Mislevy & Bock, 1982), for example, fixes
the covariance matrix at user-specified values, so that item
parameters shrink toward the center of their distribution at a rate
controlled by the user, but that center is estimated from the data.

Some of these effects can be illustrated with an analysis of a
simulated data set, with responses of 1000 simulated examinees
selected at random from a unit normal population to 20 test items.
The parameters of the items were also generated from independent
normal distributions; for α = log a, the mean and variance were
0.0 and .5; for β = b, .5 and 1.0; and for γ = logit c, -1.39 and
.16. Item parameters were estimated in two ways:

1. Marginal maximum likelihood (MML). Using the BILOG computer
program, the following likelihood equation was maximized with
respect to item parameters (α, β, γ) and weights p at ten
equally spaced quadrature points between -4 and +4:

$$
L = \prod_i \sum_q p(u_i \mid \alpha, \beta, \gamma, X_q)\, p_q
  = \prod_i \int p(u_i \mid \alpha, \beta, \gamma, \theta)\, g(\theta)\, d\theta .
$$

2. Bayes estimation. To obtain Bayes modal estimates of item
parameters, a posterior of similar form was maximized:

$$
P(\alpha, \beta, \gamma, p, \mu_\Gamma \mid U, \Sigma_\Gamma) \propto
\prod_i \sum_q p(u_i \mid \alpha, \beta, \gamma, X_q)\, p_q
\cdot p(\alpha, \beta, \gamma \mid \mu_\Gamma, \Sigma_\Gamma) .
$$

This is the "floating priors" option of BILOG; the mean
vector $\mu_\Gamma$ of item parameters is estimated concurrently
with the item parameters themselves, but a prior covariance
matrix is supplied by the user. BILOG default values of
1.00, 4.00, and 0.25 were employed for $\Sigma_{\alpha\alpha}$,
$\Sigma_{\beta\beta}$, and $\Sigma_{\gamma\gamma}$.
(These values are intended to be sufficiently mild to
affect most parameters minimally when the data supply
information about them, but keep all estimates within
"reasonable" ranges.) Off-diagonal elements of $\Sigma_\Gamma$ were
set at zero.
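
The simulation design described before the two estimation methods can be sketched as follows. This is not the generator used for the reported analysis: the random number stream, the seed, the function name, and the use of the logistic metric without the 1.7 scaling constant are all assumptions, so the data it produces will not reproduce the reported figures.

```python
import math
import random


def simulate_3pl(n_examinees=1000, n_items=20, seed=0):
    """Simulate 0/1 responses under a three-parameter logistic model.

    Item parameters are drawn from independent normal distributions
    with the means and variances quoted in the text; abilities are
    drawn from a unit normal population.
    """
    rng = random.Random(seed)
    items = []
    for _ in range(n_items):
        alpha = rng.gauss(0.0, math.sqrt(0.5))      # log a: mean 0.0, variance .5
        beta = rng.gauss(0.5, math.sqrt(1.0))       # b: mean .5, variance 1.0
        gamma = rng.gauss(-1.39, math.sqrt(0.16))   # logit c: mean -1.39, variance .16
        a = math.exp(alpha)
        c = 1.0 / (1.0 + math.exp(-gamma))
        items.append((a, beta, c))

    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]

    responses = []
    for theta in thetas:
        row = []
        for a, b, c in items:
            p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
            row.append(1 if rng.random() < p else 0)
        responses.append(row)
    return items, thetas, responses


items, thetas, responses = simulate_3pl()
print(len(responses), len(responses[0]))  # 1000 20
```

Note that a mean of -1.39 on the logit scale corresponds to a lower asymptote of about .20, a typical value for five-choice items.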

The value of -2 log L under the MML solution was found to be
22,295, while the value obtained by substituting the Bayes estimates
into the likelihood function was 22,300. This trivial difference
implies that the Bayes estimates explain the observed data nearly
as well as MML estimates.

Indeed, with a few exceptions (more on these below), MML and
Bayes estimates of a and b were quite similar, with a's tending
to be shrunken slightly toward their estimated mean of .21.
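
The comparison of -2 log L values reported above rests on evaluating the marginal likelihood at a given set of item parameter estimates. A minimal sketch of that evaluation follows. It assumes fixed quadrature weights proportional to a standard normal density at ten equally spaced points on [-4, 4], whereas the analysis in the text estimates the weights; the function name, the toy data, and the item parameters are hypothetical.

```python
import math


def neg2_log_marginal_likelihood(responses, items, points, weights):
    """Return -2 log L, where L = prod_i sum_q p(u_i | items, X_q) * p_q.

    responses: list of 0/1 response vectors, one per examinee
    items:     list of (a, b, c) triples, one per item
    points:    quadrature points X_q
    weights:   weights p_q at those points (should sum to one)
    """
    total = 0.0
    for u in responses:
        marginal = 0.0
        for x, w in zip(points, weights):
            like = 1.0
            for u_ij, (a, b, c) in zip(u, items):
                p = c + (1.0 - c) / (1.0 + math.exp(-a * (x - b)))
                like *= p if u_ij == 1 else 1.0 - p
            marginal += like * w
        total += math.log(marginal)
    return -2.0 * total


# Ten equally spaced points on [-4, 4]; normal-shaped weights are an assumption.
points = [-4.0 + 8.0 * q / 9 for q in range(10)]
raw = [math.exp(-0.5 * x * x) for x in points]
weights = [r / sum(raw) for r in raw]

# Toy data: three examinees, two hypothetical items.
toy_items = [(1.0, 0.0, 0.2), (1.5, 0.5, 0.2)]
toy_responses = [[1, 0], [1, 1], [0, 0]]
print(neg2_log_marginal_likelihood(toy_responses, toy_items, points, weights))
```

Evaluating the function twice on the same data, once with each set of item parameter estimates, gives the kind of comparison reported in the text.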

Estimates of asymptotes were more significantly affected, as
seen in Figures 1 and 2. These figures plot generating and
estimated values of c, MML and Bayes solutions respectively, against
generating values of the quantity b - 2/a, a heuristic index based
on the observation that less information is obtained about c as
items become easier or less reliable (Lord, 1975). Items with high
values of this index are seen to have estimated c's near their
generating values under both estimation procedures, but certain
items with low values are regressed strongly toward the estimated
mean of about .21. To anthropomorphize, we might say that the
Bayes solution felt true c's for these items were probably more
similar to the c's that it could estimate well than to the atypical
and unstable MML values based on sparse information.
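
For reference, the index b - 2/a is simple to compute. The sketch below applies it to the generating a and b values of the six items in Table 1, as read from the table in this copy (two of the b entries are partly garbled in the source, so the transcription is an assumption). The function name is illustrative.

```python
def asymptote_index(a, b):
    """Heuristic index b - 2/a; lower values suggest that the data
    carry less information about the lower asymptote c."""
    return b - 2.0 / a


# Generating (a, b) for the six items of Table 1, as transcribed here.
generating = {
    1: (1.1, -0.4),
    2: (0.5, 0.2),
    3: (0.9, -1.3),
    4: (1.4, -1.0),
    5: (1.5, -0.3),
    6: (2.5, -1.1),
}

# List the items from lowest to highest index value.
for item in sorted(generating, key=lambda j: asymptote_index(*generating[j])):
    a, b = generating[item]
    print(item, round(asymptote_index(a, b), 2))
```

Both a low b (an easy item) and a low a (a weakly discriminating item) push the index down.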

Insert Figures 1 and 2 about here 

It is instructive to consider the estimated a's and b's of
these items, to see how item parameters can "trade off" against one
another. Values for the six items showing the largest differences
between MML and Bayes estimated c's are shown in Table 1. The
following results may be observed:

Insert Table 1 about here

Item 1 is relatively easy, so that the increased c value
obtained by the Bayes solution has little effect on the estimated
a and b. As it turns out, the generating c for this item was lower
and more atypical than either MML or Bayes obtained, but since most
of the examinees were well above the chance level, it did not really
matter. Item 4 is similar, in that a large degree of shrinkage of
the estimated c on an easy item has little effect on the other
parameters. This time (and, the model assumes, more often than not)
the Bayes estimate is closer to the true value.

Item 2 shows an extremely high c under MML shrunken back by
Bayes procedures to a lower, more nearly correct, value. While the
estimated a's are similar, the estimated b under Bayes is
correspondingly reduced somewhat, again closer to its true value.
The point here is that spuriously over-estimated c's induce
spuriously over-estimated b's, a result guarded against in two ways
when priors are enforced on both parameters.

Items 3 and 6 show items with high MML a estimates being
shrunken back toward their mean under Bayes, and extreme c's
correspondingly regressed. Both items are relatively easy, but
it is seen that pulling down a spuriously high c (item 3) affects
b whereas increasing a spuriously low c (item 6) does not.

Finally, item 5 shows an atypically low c regressing toward
its mean, causing a corresponding shift in a away from its mean.
The estimated b's are similar under both models.

Discussion 

Maximum likelihood (ML) estimation is justified by its
asymptotic properties alone. Taking the data for each parameter at
face value no matter how sparse, ML will often yield infinite or
implausible parameter estimates in small samples. Thissen and
Wainer (1982) suggest that at least for certain parameters, a sample
size of 10,000 examinees can be a small sample in the context of the
three-parameter logistic IRT model; estimation procedures therefore
stand to profit from the incorporation of additional information.
The hierarchical Bayesian framework given in the present article
supplies such information in a very modest way. In effect, it
quantifies beliefs such as

1. if the items for which we can reasonably estimate c's
yield values between .1 and .3, then the items for which
less information is available probably have c's in this
range as well;

2. if most of the items have a's between 1/3 and 3, then the
a for this particular item is probably not 957;

3. if all of the other examinees seem to have θ's between -3
and +3, the θ for this examinee is probably not +∞ even
though he did correctly answer both items he was presented.

Such strictures are implied by the assumption that parameters
belong to respective well-behaved populations, about the higher-level
parameters of which little or nothing need be assumed. The effect
of this so-called assumption of exchangeability is to "shrink"
estimates from where they would have been under ML toward the
centers of the respective populations. (This is always true for
unimodal prior distributions, though with multimodal priors certain
parameters may be shrunk toward local modes rather than the global
mode.)
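
The shrinkage effect can be illustrated in its simplest form. The sketch below uses a normal likelihood with a normal population distribution, which is a deliberate simplification and not the logistic model of this article: the shrunken estimate is a precision-weighted average of the ML estimate and the population center, so poorly determined estimates move furthest. All names and numbers are hypothetical.

```python
def shrink(estimate, sampling_var, pop_mean, pop_var):
    """Posterior mean (and mode) of a normal mean under a normal prior.

    The weight on the ML estimate is pop_var / (pop_var + sampling_var),
    so it falls toward zero as the estimate becomes less precise.
    """
    w = pop_var / (pop_var + sampling_var)
    return w * estimate + (1.0 - w) * pop_mean


# Two hypothetical parameters with the same ML estimate of 0.03, drawn
# from a population centered at 0.21 with variance 0.0025.
well_determined = shrink(0.03, sampling_var=0.0001, pop_mean=0.21, pop_var=0.0025)
poorly_determined = shrink(0.03, sampling_var=0.01, pop_mean=0.21, pop_var=0.0025)
print(well_determined, poorly_determined)
```

The well-determined estimate stays close to its ML value, while the poorly determined one is pulled most of the way to the population center.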

When it is not reasonable to assume a common population, 
however, exchangeability is violated. Graphic examples of the 
absurdities that can result are suggested by proponents as well as 
critics of "shrunken" estimators. Should one expect to obtain 
better estimates of the true batting averages of baseball players, 
for instance, by including data on the price of wheat? The point is 
that shrinking estimates toward a common center is justified only 
when a common population best represents the extent of our prior 
knowledge. The imposition of exchangeability across all units, and 
estimation procedures that require it, are not strictly appropriate 
when additional information differentiating the units is at hand. 

It is in fact this latter case that typically prevails in 
educational and psychological measurement. Already known, or 
available more economically than responses from examinees, is 
information from several sources:




1. Cognitive processing requirements of items can be specified,
at least to some degree. Mental rotation items, for
example, can be characterized in terms of the number of
degrees the target object has been rotated; differential
calculus items, an example discussed by Fischer (1973),
can be characterized in terms of the derivation rules they
demand for solution.

2. Surface features of items can be identified which can
suggest a need for distinguishing subpopulations of items.
Free-response and multiple-choice items in the same test
may be distinguished, for example, as may be analogy items
from vocabulary items in the SAT.

3. Item content can often be identified. In a test of
reading comprehension, one might wish to differentiate items
associated with narrative passages, poetry, and documents.

4. Quantitative information, such as percents-correct from
pretesting, may be available.

5. Examinees may be differentiated with respect to
qualitative features such as sex, educational program,
or racial/ethnic background; or with respect to quantitative
variables such as scores on previously-administered tests.

More comprehensive Bayesian procedures would provide for the
utilization of such information. They would also provide for means
of determining when such information makes material differences in
item and population parameter estimates.




References 

Andersen, E. B., & Madsen, M. (1977). Estimating the parameters of
a latent population distribution. Psychometrika, 42, 357-374.

Ando, A., & Kaufman, G. M. (1965). Bayesian analysis of the
independent normal process - neither mean nor precision known.
Journal of the American Statistical Association, 60, 347-358.

Birnbaum, A. (1968). Some latent trait models and their use in
inferring an examinee's ability. In F. M. Lord & M. R. Novick,
Statistical theories of mental test scores. Reading, MA:
Addison-Wesley.

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood
estimation of item parameters: Application of an EM algorithm.
Psychometrika, 46, 443-459.

Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of
ability in a microcomputer environment. Applied Psychological
Measurement, 6, 431-444.

Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum
likelihood from incomplete data via the EM algorithm. Journal
of the Royal Statistical Society, Series B, 39, 1-38.

Efron, B., & Morris, C. (1975). Data analysis using Stein's
estimator and its generalizations. Journal of the American
Statistical Association, 70, 311-319.




Fischer, G. (1973). The linear logistic test model as an instrument
in educational research. Acta Psychologica, 37, 359-374.

James, W., & Stein, C. (1961). Estimation with quadratic loss.
Proceedings of the Fourth Berkeley Symposium on Mathematical
Probability and Statistics (Vol. 1). Berkeley: University of
California Press.

Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford:
Clarendon Press.

Kelley, T. L. (1927). The interpretation of educational
measurements. New York: World Press.

Kendall, M. G., & Stuart, A. (1973). The advanced theory of
statistics (Vol. II, 3rd ed.). New York: Hafner.

Leonard, T. (1982). Comment on "A simple predictive density
function" by Lejeune and Faulkenberry. Journal of the American
Statistical Association, 77, 657-658.

Lindley, D. V., & Smith, A. F. M. (1972). Bayes estimates for the
linear model. Journal of the Royal Statistical Society,
Series B, 34, 1-41.

Lord, F. M. (1952). A theory of test scores. Psychometric
Monograph, No. 7. Psychometric Society.

Lord, F. M. (1975). Evaluation with artificial data of a procedure
for estimating ability and item characteristic curve parameters
(RB-75-33). Princeton, NJ: Educational Testing Service.




Lord, F. M. (1980). Applications of item response theory to
practical testing problems. Hillsdale, NJ: Erlbaum.

Mislevy, R. J. (1984). Estimating latent distributions.
Psychometrika, 49, 359-381.

Mislevy, R. J., & Bock, R. D. (1981, July). Implementation of an EM
algorithm in the estimation of item parameters. Paper presented
at the IRT/CAT Invitational Conference, Minneapolis, MN.

Mislevy, R. J., & Bock, R. D. (1982). BILOG: Item analysis and test
scoring with binary logistic models [Computer program].
Mooresville, IN: Scientific Software.

Novick, M. R., Jackson, P. H., Thayer, D. T., & Cole, N. S.
(1972). Estimating multiple regressions in m-groups: A cross-
validational study. British Journal of Mathematical and
Statistical Psychology, 25, 33-50.

O'Hagan, A. (1976). On posterior joint and marginal modes.
Biometrika, 63, 329-333.

Rasch, G. (1960). Probabilistic models for some intelligence and
attainment tests. Copenhagen: Danish Institute for Educational
Research.

Reiser, M. R. (1981, June). Bayesian estimation of item parameters
in the two-parameter logistic model. Paper presented at the
annual meeting of the Psychometric Society in Chapel Hill, NC.




Rigdon, S., & Tsutakawa, R. (1983). Parameter estimation in latent
trait models. Psychometrika, 48, 567-574.

Rubin, D. B. (1980). Using empirical Bayes techniques in the law
school validity studies. Journal of the American Statistical
Association, 75, 801-827.

Sanathanan, L., & Blumenthal, S. (1978). The logistic model and
latent structure. Journal of the American Statistical
Association, 73, 794-798.

Stroud, A. H., & Secrest, D. (1966). Gaussian quadrature formulas.
Englewood Cliffs, NJ: Prentice-Hall.

Swaminathan, H., & Gifford, J. A. (1982). Bayesian estimation in the
Rasch model. Journal of Educational Statistics, 7, 175-192.

Swaminathan, H., & Gifford, J. A. (1984, in press). Bayesian
estimation in the three-parameter logistic model. Psychometrika.

Thissen, D. (1982). Marginal maximum likelihood estimation for the
one-parameter logistic model. Psychometrika, 47, 175-186.

Thissen, D., & Wainer, H. (1982). Some standard errors in item
response theory. Psychometrika, 47, 397-412.




Table 1

Generating and Estimated Parameters of Selected Items

               a                     b                     c
Item    True   MML  Bayes     True   MML  Bayes     True   MML  Bayes

  1      1.1   1.2   1.3      -.4    -.4   -.3       .11   .14   .17
  2       .5    .4    .4       .2     .8    .6       .19   .28   .24
  3       .9   1.5   1.1     -1.3    -.6  -1.0       .26   .44   [illegible]
  4      1.4   1.2   1.4     -1.0   -1.2  -1.0       .17   .03   .19
  5      1.5   2.2   2.4      -.3    -.2   -.2       .13   .12   .14
  6      2.5   4.5   3.4     -1.1   -1.2  -1.1       .18   .03   .18

Note. The Bayes estimate of c for item 3 is illegible in the source.

Figure 1. Generating and MML estimated values of c, against
generating b - 2/a. (Filled circles: generating value of c; open
circles: estimated value of c.)

Figure 2. Generating and Bayes estimated values of c, against
generating b - 2/a. (Filled circles: generating value of c; open
circles: estimated value of c.)